Abstract: Codes in the Grassmannian space have recently found application in network coding. The representation of $k$-dimensional subspaces of $\mathbb{F}_q^n$ generally plays an essential role in solving coding problems in the Grassmannian, and in particular in encoding subspaces of the Grassmannian. Different representations of subspaces in the Grassmannian are presented. We use two of these representations for enumerative encoding of the Grassmannian. One enumerative encoding is based on the Ferrers diagram representation of subspaces; another is based on the identifying vector and reduced row echelon form representation of subspaces. A third method, which combines the previous two, is more efficient than either of them.



I. Introduction 

Let $\mathbb{F}_q$ be a finite field of size $q$. The Grassmannian space (Grassmannian, in short), denoted by $\mathcal{G}_q(n,k)$, is the set of all $k$-dimensional subspaces of the vector space $\mathbb{F}_q^n$, for any given two nonnegative integers $k$ and $n$, $k \le n$. A code $\mathbb{C}$ in the Grassmannian is a subset of $\mathcal{G}_q(n,k)$.

Koetter and Kschischang [1] showed the application of error-correcting codes in $\mathcal{G}_q(n,k)$ to random network coding. This application has motivated extensive work in the area [2], [3], [4], [5], [6], [7], [8]. On the other hand, the Grassmannian and codes in the Grassmannian are interesting in their own right [9], [10], [11], [12], [13]. A natural question is how to encode/decode the subspaces in the Grassmannian in an efficient way. To answer this question we first need to give a representation of subspaces, order all of them, and encode/decode them based on this representation and order.

Cover [14] presented a general method of enumerative 
encoding for a subset S of binary words. Given a lexico- 
graphic ordering of S, he presented an efficient algorithm for 
calculating the index of any given element of S (encoding). 
He also presented an inverse algorithm to find the element 
from S given its index (decoding). Our goal in this paper 
is to apply this scheme to all subspaces in a Grassmannian, 
based on different lexicographic orders. 

First, we present the encoding scheme of Cover [14]. Let $\{0,1\}^n$ denote the set of all binary vectors of length $n$. Let $S$ be a subset of $\{0,1\}^n$. Denote by $n_S(x_1, x_2, \ldots, x_k)$ the number of elements of $S$ for which the first $k$ coordinates are given by $(x_1, x_2, \ldots, x_k)$.

The lexicographic order is defined as follows. We say that for $x, y \in \{0,1\}^n$, $x < y$, if $x_k < y_k$ for the least index $k$ such that $x_k \neq y_k$. For example, $00101 < 00110$.



Theorem 1: [14] The lexicographic index of $x \in S$ is

$$\mathrm{ind}_S(x) = \sum_{j=1}^{n} x_j \cdot n_S(x_1, x_2, \ldots, x_{j-1}, 0).$$



Remark 1: The encoding algorithm of Cover is efficient if $n_S(x_1, x_2, \ldots, x_{j-1}, 0)$ can be calculated efficiently.

Let $S$ be a given subset and $i$ be a given index. The following algorithm finds $x$ such that $\mathrm{ind}_S(x) = i$.

Inverse algorithm [14]: For $k = 1, \ldots, n$, if $i \ge n_S(x_1, x_2, \ldots, x_{k-1}, 0)$ then set $x_k = 1$ and $i = i - n_S(x_1, x_2, \ldots, x_{k-1}, 0)$; otherwise set $x_k = 0$.
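Theorem 1 and the inverse algorithm can be illustrated with a short Python sketch. The helper names (`make_counter`, `cover_index`, `cover_inverse`) and the sample set $S$ (binary words of length 5 and weight 2) are assumptions made for this example. Counting prefixes by scanning $S$ is done only for clarity; Remark 1 concerns the case where $n_S$ has an efficient closed form.

```python
from itertools import product

def make_counter(S):
    """Return n_S(prefix): the number of words of S that start with prefix."""
    words = [tuple(w) for w in S]
    def n_S(prefix):
        prefix = tuple(prefix)
        return sum(1 for w in words if w[:len(prefix)] == prefix)
    return n_S

def cover_index(x, n_S):
    """Theorem 1: ind_S(x) = sum_j x_j * n_S(x_1, ..., x_{j-1}, 0)."""
    return sum(n_S(tuple(x[:j]) + (0,)) for j in range(len(x)) if x[j] == 1)

def cover_inverse(i, n, n_S):
    """Inverse algorithm: rebuild x from its index, one coordinate at a time."""
    x = []
    for _ in range(n):
        below = n_S(tuple(x) + (0,))   # words that extend the prefix with a 0
        if i >= below:
            x.append(1)
            i -= below
        else:
            x.append(0)
    return tuple(x)

# Hypothetical sample set: binary words of length 5 and weight 2, in lexicographic order
S = sorted(w for w in product((0, 1), repeat=5) if sum(w) == 2)
n_S = make_counter(S)
```

In this sample set, `cover_index` returns the position of a word in the sorted list `S`, and `cover_inverse` maps that position back to the word.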

Cover [14] also presented the extension of these results to an arbitrary finite alphabet. For our purpose this extension is more relevant, as we will see in the sequel. The formula for calculating the lexicographic index of $x \in S \subseteq \{1, 2, 3, \ldots, M\}^n$ is as follows.

$$\mathrm{ind}_S(x) = \sum_{j=1}^{n} \sum_{m < x_j} n_S(x_1, x_2, \ldots, x_{j-1}, m). \qquad (1)$$

Cover did not prove the correctness of this formula and did not present the inverse algorithm. We will fill in some of these omissions for our decodings in the sequel.
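A possible inverse for formula (1) can be sketched in Python as follows. This is an illustration, not the algorithm of [14]: the function names and the sample set (words of length 3 over $\{1,2,3,4\}$ with distinct symbols) are assumptions, and $n_S$ is again computed by brute force.

```python
from itertools import permutations

def make_counter(S):
    """Return n_S(prefix): the number of words of S that start with prefix."""
    words = [tuple(w) for w in S]
    def n_S(prefix):
        prefix = tuple(prefix)
        return sum(1 for w in words if w[:len(prefix)] == prefix)
    return n_S

def index_mary(x, n_S):
    """Formula (1): add n_S(x_1..x_{j-1}, m) for every position j and symbol m < x_j."""
    return sum(n_S(tuple(x[:j]) + (m,))
               for j in range(len(x))
               for m in range(1, x[j]))

def inverse_mary(i, n, M, n_S):
    """Rebuild x from its index by skipping whole blocks of smaller symbols."""
    x = []
    for _ in range(n):
        for m in range(1, M + 1):
            block = n_S(tuple(x) + (m,))   # words continuing the prefix with m
            if i < block:
                x.append(m)
                break
            i -= block
        else:
            raise ValueError("index out of range for S")
    return tuple(x)

# Hypothetical sample set: length-3 words over {1, 2, 3, 4} with distinct symbols
S = sorted(permutations((1, 2, 3, 4), 3))
n_S = make_counter(S)
```

The inner loop generalizes the binary test "is $i$ at least the number of words that continue with 0" to a scan over the $M$ symbols in increasing order.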

In our work we present three different ways for enumera- 
tive encoding of the Grassmannian. One is based on Ferrers 
diagrams ordering; another is based on the identifying vectors 
combined with the reduced row echelon form ordering; and 
the third one is a combination of the first two. 

The rest of this paper is organized as follows. In Section II we discuss different representations of subspaces in the Grassmannian. We define the reduced row echelon form of a $k$-dimensional subspace and its Ferrers diagram. These two structures, combined with the identifying vector of a subspace, will be our main tools for the representation of subspaces. In Section III we define an order of the Grassmannian based on the Ferrers diagram representation and present the first enumerative encoding method. In Section IV we define another lexicographic order of the Grassmannian, based on the representation of a subspace by its identifying vector and its reduced row echelon form, and describe the second enumerative encoding method. In Section V we show how to combine the two encoding methods mentioned above. Finally, in Section VI we summarize our results and discuss further applications of the different orders of the Grassmannian. This leads to further results and problems for future research.



II. Representation of Subspaces 

In this section we give the definitions for two structures which are useful in describing a subspace in $\mathcal{G}_q(n,k)$, i.e., the reduced row echelon form and the Ferrers diagram. The reduced row echelon form is a standard way to describe a linear subspace. The Ferrers diagram is a standard way to describe a partition of a given positive integer. Based on these two structures and the identifying vector of a subspace we will present a few representations for subspaces which will be the key for our enumerative encodings.

A $k$-dimensional subspace $X \subseteq \mathbb{F}_q^n$ can be represented by a $k \times n$ generator matrix whose rows form a basis for $X$. To have a unique representation of a subspace, we use the following definition.

A k x n matrix with rank k is in reduced row echelon form 
(RREF in short) if the following conditions are satisfied. 

• The leading coefficient of a row is always to the right 
of the leading coefficient of the previous row. 

• All leading coefficients are ones. 

• Every leading coefficient is the only nonzero entry in 
its column. 

We represent a subspace X of a Grassmannian by its 
generator matrix in RREF. There is exactly one such matrix 
and it will be denoted by RE(X). 

Example 1: We consider the 3-dimensional subspace $X$ of $\mathbb{F}_2^7$ with the following eight elements.

0000000, 1000110, 0010101, 0001011, 1010011, 1001101, 0011110, 1011000

The generator matrix of $X$ in RREF is given by

$$RE(X) = \begin{pmatrix} 1&0&0&0&1&1&0 \\ 0&0&1&0&1&0&1 \\ 0&0&0&1&0&1&1 \end{pmatrix}.$$

Remark 2: It appears that designing an enumerative en- 
coding of the Grassmannian based on this representation 
won't be efficient and we need to find other representations 
of a subspace for this purpose. 

Each $k$-dimensional subspace $X$ of $\mathbb{F}_q^n$ has an identifying vector $v(X)$ [7]. The vector $v(X)$ is a binary vector of length $n$ and weight $k$, where the ones in $v(X)$ are in the positions (columns) where $RE(X)$ has the leading coefficients (of the rows).

Remark 3: We can consider an identifying vector $v(X)$ for some $k$-dimensional subspace $X$ as a characteristic vector of a $k$-subset. This coincides with the definition of the rank- and order-preserving map $\phi$ from $\mathcal{G}_q(n,k)$ onto the lattice of subsets of an $n$-set, given by Knuth [9] and discussed by Milne [10].

Example 2: Consider the 3-dimensional subspace $X$ of Example 1. Its identifying vector is $v(X) = 1011000$.

Remark 4: For a representation of a $k$-dimensional subspace $X$ we only need $v(X)$ and the $k \times (n-k)$ matrix formed by the columns of $RE(X)$ which correspond to the zeroes in $v(X)$.

Remark 5: A somewhat less compact way to represent a $k$-dimensional subspace $X$ is to form a $(k+1) \times n$ matrix where the first row is the identifying vector, $v(X)$, and the last $k$ rows form the RREF of $X$, $RE(X)$. We will see in the sequel that this representation will be very useful in our encoding algorithms.

Example 3: Consider the subspace $X$ of Example 1. Its representation by a $(k+1) \times n$ matrix is given by

$$\begin{pmatrix} 1&0&1&1&0&0&0 \\ 1&0&0&0&1&1&0 \\ 0&0&1&0&1&0&1 \\ 0&0&0&1&0&1&1 \end{pmatrix}.$$

A partition of a positive integer $m$ is a representation of $m$ as a sum of positive integers. The partition function $p(m)$ is the number of partitions of $m$ [15], [16].

Example 4: One of the possible partitions of 21 is $6 + 5 + 5 + 3 + 2$, and $p(21) = 792$.

A Ferrers diagram $\mathcal{F}$ represents a partition as a pattern of dots with the $i$-th row having the same number of dots as the $i$-th term in the partition [15], [16]. A Ferrers diagram satisfies the following conditions.

• The number of dots in a row is at most the number of dots in the previous row.

• All the dots are shifted to the right of the diagram.

Let $|\mathcal{F}|$ denote the size of $\mathcal{F}$, i.e., the number of dots in $\mathcal{F}$.

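Example 5: For the partition of Example 4 the Ferrers diagram $\mathcal{F}$, $|\mathcal{F}| = 21$, is given by

$$\mathcal{F} = \begin{array}{cccccc} \bullet&\bullet&\bullet&\bullet&\bullet&\bullet \\ &\bullet&\bullet&\bullet&\bullet&\bullet \\ &\bullet&\bullet&\bullet&\bullet&\bullet \\ &&&\bullet&\bullet&\bullet \\ &&&&\bullet&\bullet \end{array}$$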

The echelon Ferrers form of a vector $v$ of length $n$ and weight $k$, $EF(v)$, is the $k \times n$ matrix in RREF with leading entries (of rows) in the columns indexed by the nonzero entries of $v$ and "$\bullet$" in all entries which do not have terminal zeroes or ones. A "$\bullet$" will be called in the sequel a dot. The dots of this matrix form the Ferrers diagram of $EF(v)$. If we substitute elements of $\mathbb{F}_q$ in the dots of $EF(v)$ we obtain a $k$-dimensional subspace $X$ of $\mathcal{G}_q(n,k)$. $EF(v)$ will also be called the echelon Ferrers form of $X$.

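Example 6: The echelon Ferrers form of the vector $v = 1011000$ is

$$EF(v) = \begin{pmatrix} 1&\bullet&0&0&\bullet&\bullet&\bullet \\ 0&0&1&0&\bullet&\bullet&\bullet \\ 0&0&0&1&\bullet&\bullet&\bullet \end{pmatrix}.$$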



The Ferrers tableaux form of a subspace $X$, denoted by $\mathcal{F}(X)$, is obtained by assigning the values of $RE(X)$ in the Ferrers diagram of $EF(v(X))$.

Remark 6: $\mathcal{F}(X)$ defines a representation of $X$.



Example 7: For the subspace $X$ given in Example 1, whose echelon Ferrers form is given in Example 6, the Ferrers tableaux form is

$$\mathcal{F}(X) = \begin{array}{cccc} 0&1&1&0 \\ &1&0&1 \\ &0&1&1 \end{array}$$
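The representations of this section can be read off a generator matrix in RREF mechanically. The following Python sketch, with hypothetical function names and the matrix of Example 1 as input, computes the identifying vector and the rows of the Ferrers tableaux form; it assumes the input is already in RREF and does no validation.

```python
def identifying_vector(rref):
    """v(X): ones in the columns that hold the leading coefficients of RE(X)."""
    v = [0] * len(rref[0])
    for row in rref:
        lead = next(c for c, entry in enumerate(row) if entry != 0)
        v[lead] = 1
    return v

def ferrers_tableaux_form(rref):
    """F(X): per row, the entries right of the leading one in non-pivot columns."""
    v = identifying_vector(rref)
    tableau = []
    for row in rref:
        lead = next(c for c, entry in enumerate(row) if entry != 0)
        tableau.append([row[c] for c in range(lead + 1, len(row)) if v[c] == 0])
    return tableau

# RE(X) of Example 1
RE_X = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 1, 0, 1, 1],
]
```

Each returned row is one (right-aligned) row of the Ferrers tableaux form, and its length is the number of dots in the corresponding row of the echelon Ferrers form.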



III. Encoding based on Ferrers Tableaux Forms 

In this section we present an encoding of the Grassmannian based on the Ferrers tableaux form representation of $k$-dimensional subspaces. The number of dots in a Ferrers diagram of a $k$-dimensional subspace is at most $k \cdot (n-k)$. It can be embedded in a $k \times (n-k)$ box. We define a lexicographic order of such Ferrers diagrams, which induces an order of subspaces in the Grassmannian, and then apply the enumerative encoding to all $k$-dimensional subspaces.

The order that we define in the sequel is based on the following theorem [15], which shows the connection between the number of $k$-dimensional subspaces of $\mathbb{F}_q^n$, denoted by the $q$-ary Gaussian coefficient $\begin{bmatrix} n \\ k \end{bmatrix}_q$, and partitions.

Theorem 2: For any given integers $k$ and $n$, $k \le n$,

$$\begin{bmatrix} n \\ k \end{bmatrix}_q = \sum_{l=0}^{k(n-k)} \alpha_l q^l,$$

where the coefficient $\alpha_l$ is the number of partitions of $l$ whose Ferrers diagrams fit in a box of size $k \times (n-k)$.
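Theorem 2 can be checked numerically for small parameters. The Python sketch below is only such a check: `gaussian` evaluates the standard product formula for the Gaussian coefficient, and `count_partitions` counts the $\alpha_l$ by direct recursion over the parts of a partition. Both function names are assumptions made for this example.

```python
from functools import lru_cache

def gaussian(n, k, q):
    """q-ary Gaussian coefficient: number of k-dimensional subspaces of F_q^n."""
    if k < 0 or k > n:
        return 0
    num = den = 1
    for i in range(k):
        num *= q ** (n - i) - 1
        den *= q ** (i + 1) - 1
    return num // den

def count_partitions(l, rows, cols):
    """alpha_l: partitions of l with at most `rows` parts, each part at most `cols`."""
    @lru_cache(maxsize=None)
    def count(remaining, parts_left, max_part):
        if remaining == 0:
            return 1
        if parts_left == 0:
            return 0
        # choose the next (largest remaining) part, then recurse on smaller parts
        return sum(count(remaining - part, parts_left - 1, part)
                   for part in range(1, min(max_part, remaining) + 1))
    return count(l, rows, cols)

def theorem2_rhs(n, k, q):
    """Right-hand side of Theorem 2: sum of alpha_l * q^l over l = 0..k(n-k)."""
    return sum(count_partitions(l, k, n - k) * q ** l
               for l in range(k * (n - k) + 1))
```

For instance, with $n = 6$, $k = 3$, $q = 2$ both sides evaluate to 1395.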

A. Encoding of Ferrers Diagrams 

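Let $\mathcal{F}$ be a Ferrers diagram of size $m$ embedded in a $k \times (n-k)$ box. We represent $\mathcal{F}$ by an integer vector of length $n-k$, $(\mathcal{F}_{n-k}, \ldots, \mathcal{F}_2, \mathcal{F}_1)$, where $\mathcal{F}_i$ is equal to the number of dots in the $i$-th column of $\mathcal{F}$, $1 \le i \le n-k$, where we number the columns from right to left. Note that $\mathcal{F}_{i+1} \le \mathcal{F}_i$, $1 \le i \le n-k-1$.

Let $\mathcal{F}$ and $\tilde{\mathcal{F}}$ be two Ferrers diagrams of the same size. We say that $\mathcal{F} < \tilde{\mathcal{F}}$ if $\mathcal{F}_i > \tilde{\mathcal{F}}_i$ for the least index $i$ such that $\mathcal{F}_i \neq \tilde{\mathcal{F}}_i$, i.e., in the least column where they have a different number of dots, $\mathcal{F}$ has more dots than $\tilde{\mathcal{F}}$.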

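Let $N_m(\mathcal{F}_j, \ldots, \mathcal{F}_2, \mathcal{F}_1)$ be the number of Ferrers diagrams of size $m$ embedded in a $k \times (n-k)$ box, for which the first $j$ columns are given by $(\mathcal{F}_j, \ldots, \mathcal{F}_2, \mathcal{F}_1)$. The number of dots in column $j$ of $\mathcal{F}$ is at most $\mathcal{F}_{j-1}$. Hence, by (1), the lexicographic index $\mathrm{ind}_m$ of $\mathcal{F}$ among all the Ferrers diagrams with the same size $m$ is given by

$$\mathrm{ind}_m(\mathcal{F}) = \sum_{j=1}^{n-k} \sum_{a=\mathcal{F}_j+1}^{\mathcal{F}_{j-1}} N_m(a, \mathcal{F}_{j-1}, \ldots, \mathcal{F}_2, \mathcal{F}_1), \qquad (2)$$

where we define $\mathcal{F}_0 = k$.

Note that $0 \le \mathrm{ind}_m(\mathcal{F}) \le \alpha_m - 1$, where $\alpha_m$ is defined in Theorem 2.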

Let $p(m,k,\eta)$ be the number of Ferrers diagrams of size $m$ which are embedded in a $k \times \eta$ box, i.e., $p(m,k,n-k) = \alpha_m$. The following lemma can be easily verified.

Lemma 1: $p(m,k,\eta)$ satisfies the following recurrence relations:

$$p(m,k,\eta) = p(m-k,k,\eta-1) + p(m,k-1,\eta),$$

$$p(m,k,\eta) = p(k\eta-m,k,\eta),$$

with the initial conditions

$$p(m,k,\eta) = 0 \text{ if } m < 0; \qquad p(m,1,\eta) = 1 \text{ if } 0 \le m \le \eta; \qquad p(m,k,1) = 1 \text{ if } 0 \le m \le k.$$

Remark 7: Since $p(m,k,\eta) = p(k\eta-m,k,\eta)$, we can assume that $m \le \lfloor k\eta/2 \rfloor$.

Now, using the definition of $p(m,k,\eta)$ we can calculate the value of $N_m(\mathcal{F}_j, \ldots, \mathcal{F}_2, \mathcal{F}_1)$.

Lemma 2:

$$N_m(\mathcal{F}_j, \ldots, \mathcal{F}_2, \mathcal{F}_1) = p\Big(m - \sum_{i=1}^{j} \mathcal{F}_i,\ \mathcal{F}_j,\ n-k-j\Big).$$

Lemma 2 implies that if we can calculate $p(m,k,\eta)$ efficiently then we can efficiently calculate $\mathrm{ind}_m(\mathcal{F})$ for a Ferrers diagram of size $m$ embedded in a $k \times (n-k)$ box.
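Lemma 1, Lemma 2 and formula (2) fit together as in the following Python sketch. It is an illustration under stated assumptions: a diagram is passed as the tuple $(\mathcal{F}_1, \ldots, \mathcal{F}_{n-k})$ with the rightmost column first, the recursion stops at $m = 0$ instead of using the initial conditions of Lemma 1, and all function names are hypothetical. The helper `diagrams_in_order` is a brute-force enumeration used only to cross-check the index.

```python
from functools import lru_cache
from itertools import product

@lru_cache(maxsize=None)
def p(m, k, eta):
    """Number of Ferrers diagrams with m dots that fit in a k x eta box."""
    if m < 0 or m > k * eta:
        return 0
    if m == 0:
        return 1
    # Lemma 1: the tallest column has either k dots or at most k - 1 dots
    return p(m - k, k, eta - 1) + p(m, k - 1, eta)

def count_with_prefix(m, width, prefix):
    """Lemma 2 for prefix = (F_1, ..., F_j); width is n - k."""
    return p(m - sum(prefix), prefix[-1], width - len(prefix))

def ferrers_index(cols, k):
    """Formula (2); cols = (F_1, ..., F_{n-k}), rightmost column first."""
    m, width = sum(cols), len(cols)
    index, previous = 0, k               # F_0 = k
    for j, height in enumerate(cols):
        for a in range(height + 1, previous + 1):
            index += count_with_prefix(m, width, tuple(cols[:j]) + (a,))
        previous = height
    return index

def diagrams_in_order(m, k, width):
    """Brute force: all diagrams of size m, smallest first in the order above."""
    found = [c for c in product(range(k + 1), repeat=width)
             if sum(c) == m and all(c[i] >= c[i + 1] for i in range(width - 1))]
    return sorted(found, key=lambda c: tuple(-h for h in c))
```

For a $3 \times 3$ box and $m = 7$ there are two diagrams, $(3,3,1)$ and $(3,2,2)$ in the column notation above, with indices 0 and 1 respectively.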

Given an index $i$, in a similar way to the inverse algorithm of Cover we can design an inverse algorithm to find the Ferrers diagram $\mathcal{F}$ such that $i = \mathrm{ind}_m(\mathcal{F})$.

Now, we can define an order of all Ferrers diagrams 
embedded in a k x (n — k) box. 

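For two Ferrers diagrams $\mathcal{F}$ and $\tilde{\mathcal{F}}$, we say that $\mathcal{F} < \tilde{\mathcal{F}}$ if one of the following conditions holds.

• $|\mathcal{F}| > |\tilde{\mathcal{F}}|$;

• $|\mathcal{F}| = |\tilde{\mathcal{F}}|$, and $\mathrm{ind}_{|\mathcal{F}|}(\mathcal{F}) < \mathrm{ind}_{|\mathcal{F}|}(\tilde{\mathcal{F}})$.

Example 8: For the three Ferrers diagrams $\mathcal{F}$, $\tilde{\mathcal{F}}$, and $\mathcal{F}'$

[Figure: three Ferrers diagrams; not recoverable from the source]

we have $\mathcal{F} < \tilde{\mathcal{F}} < \mathcal{F}'$.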

B. Order based on the Ferrers Tableaux Forms 

Let $X, Y \in \mathcal{G}_q(n,k)$ be two $k$-dimensional subspaces, and $RE(X)$ and $RE(Y)$ the related RREFs. Let $v(X)$ and $v(Y)$ be the identifying vectors of $X$ and $Y$, respectively, and $\mathcal{F}_X$, $\mathcal{F}_Y$ the related Ferrers diagrams of $EF(v(X))$ and $EF(v(Y))$. Let $x_1, x_2, \ldots, x_{|\mathcal{F}_X|}$ and $y_1, y_2, \ldots, y_{|\mathcal{F}_Y|}$ be the entries of the Ferrers tableaux forms $\mathcal{F}(X)$ and $\mathcal{F}(Y)$, respectively. The entries of a Ferrers tableaux form are numbered from right to left, and from top to bottom.

We say that $X < Y$ if one of the following conditions holds.

• $\mathcal{F}_X < \mathcal{F}_Y$;

• $\mathcal{F}_X = \mathcal{F}_Y$, and $(x_1, x_2, \ldots, x_{|\mathcal{F}_X|}) < (y_1, y_2, \ldots, y_{|\mathcal{F}_Y|})$.

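Example 9: Let $X, Y, Z, W \in \mathcal{G}_2(6,3)$ be given by

$$RE(X) = \begin{pmatrix} 1&0&1&1&0&1 \\ 0&1&1&1&0&1 \\ 0&0&0&0&1&1 \end{pmatrix}, \qquad \mathcal{F}(X) = \begin{array}{ccc} 1&1&1 \\ 1&1&1 \\ &&1 \end{array},$$

[$RE(Y)$ and $\mathcal{F}(Y)$: zero entries were lost in the source and the matrices are not recoverable]

$$RE(Z) = \begin{pmatrix} 1&1&0&1&0&1 \\ 0&0&1&1&0&1 \\ 0&0&0&0&1&0 \end{pmatrix}, \qquad \mathcal{F}(Z) = \begin{array}{ccc} 1&1&1 \\ &1&1 \\ &&0 \end{array},$$

$$RE(W) = \begin{pmatrix} 1&1&0&1&0&1 \\ 0&0&1&1&0&1 \\ 0&0&0&0&1&1 \end{pmatrix}, \qquad \mathcal{F}(W) = \begin{array}{ccc} 1&1&1 \\ &1&1 \\ &&1 \end{array}.$$

From Example 8 we have $\mathcal{F}_Y < \mathcal{F}_X < \mathcal{F}_Z = \mathcal{F}_W$. Since $(z_1, z_2, \ldots, z_{|\mathcal{F}_Z|}) = (1,1,0,1,1,1) < (w_1, w_2, \ldots, w_{|\mathcal{F}_W|}) = (1,1,1,1,1,1)$, it follows that $Y < X < Z < W$.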
C. Encoding Based on the Ferrers Tableaux Forms 

Now, we use the order defined above and Theorem 2 for enumerative encoding of $\mathcal{G}_q(n,k)$. Let $\{x\}$ be the integer value of the vector $x = (x_1, \ldots, x_{|\mathcal{F}_X|})$ and let $\{i\}_q$ be the base $q$ representation of the integer $i$.

Theorem 3: Let $X \in \mathcal{G}_q(n,k)$, let $\mathcal{F}_X$ be the Ferrers diagram of $EF(v(X))$, and let $x_1, x_2, \ldots, x_{|\mathcal{F}_X|}$ be the entries of $\mathcal{F}(X)$. Then the index $\mathrm{Ind}_1(X)$, by the order based on the Ferrers tableaux forms, is given by

$$\mathrm{Ind}_1(X) = \sum_{l=|\mathcal{F}_X|+1}^{k(n-k)} \alpha_l q^l + \mathrm{ind}_{|\mathcal{F}_X|}(\mathcal{F}_X)\, q^{|\mathcal{F}_X|} + \{x\},$$

where $\alpha_l$ is defined in Theorem 2 and $\mathrm{ind}_{|\mathcal{F}_X|}$ is given by (2).

Now, suppose that an index $i$ is given. The following algorithm returns a subspace $X \in \mathcal{G}_q(n,k)$ such that $\mathrm{Ind}_1(X) = i$.

Inverse algorithm:

Step 1: If $i < q^{k(n-k)}$ then $|\mathcal{F}_X| = k(n-k)$; assign the values of $\{i\}_q$ to $\mathcal{F}(X)$ and stop; otherwise set $i = i - q^{k(n-k)}$.

Step 2: For $1 \le j \le k(n-k)$, if $i < \alpha_{k(n-k)-j}\, q^{k(n-k)-j}$, then $|\mathcal{F}_X| = k(n-k)-j$, $\mathcal{F}_X = \mathrm{ind}^{-1}_{|\mathcal{F}_X|}\big(\lfloor i / q^{k(n-k)-j} \rfloor\big)$; assign the values of $\{i - \lfloor i / q^{k(n-k)-j} \rfloor\, q^{k(n-k)-j}\}_q$ to $\mathcal{F}(X)$ and stop; otherwise set $i = i - \alpha_{k(n-k)-j}\, q^{k(n-k)-j}$.

Theorem 4: The complexity of the encoding/decoding based on the Ferrers tableaux forms is $O(k^{5/2}(n-k)^{5/2})$.

IV. RREF and Identifying Vector Encoding 

In this section we provide another method for enumerative encoding of the Grassmannian, based on the representation of a subspace $X \in \mathcal{G}_q(n,k)$ by a $(k+1) \times n$ matrix whose first row is $v(X)$ and whose other $k$ rows form $RE(X)$. First, we define the lexicographic order in the Grassmannian based on this representation, and then we apply enumerative encoding to the Grassmannian based on this representation.

A. Order based on the Extended Representation 

Let $X \in \mathcal{G}_q(n,k)$ be a $k$-dimensional subspace. The extended representation $EXT(X)$ of $X$ is a $(k+1) \times n$ matrix obtained by combining the identifying vector $v(X) = (v(X)_n, \ldots, v(X)_1)$ and the RREF $RE(X) = (X_n, \ldots, X_1)$, as follows:

$$EXT(X) = \begin{pmatrix} v(X)_n & \cdots & v(X)_2 & v(X)_1 \\ X_n & \cdots & X_2 & X_1 \end{pmatrix}.$$

Note that $v(X)_i$ is the most significant bit of the column vector $\binom{v(X)_i}{X_i}$.

Let $X, Y \in \mathcal{G}_q(n,k)$ and let $EXT(X)$, $EXT(Y)$ be the extended representations of $X$ and $Y$, respectively. Let $i$ be the least index such that $EXT(X)$ and $EXT(Y)$ have different columns. We say that $X < Y$ if $\left\{\binom{v(X)_i}{X_i}\right\} < \left\{\binom{v(Y)_i}{Y_i}\right\}$.

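Example 10: For $X, Y \in \mathcal{G}_2(6,3)$ whose $EXT(X)$ and $EXT(Y)$ are given by

[Matrices $EXT(X)$ and $EXT(Y)$: not recoverable from the source; of $EXT(X)$ only the first row 111000 and the last row 001100 survive]

we have $X < Y$.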

B. Enumerative Encoding Based on Extended Representation 



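Let $N\binom{v_j \cdots v_1}{X_j \cdots X_1}$ be the number of elements in $\mathcal{G}_q(n,k)$ for which the first $j$ columns in the extended representation are given by $\binom{v_j \cdots v_1}{X_j \cdots X_1}$.

Remark 8: We view all the $q$-ary vectors of length $k+1$ as our finite alphabet. Let $S$ be the set of all $q$-ary $(k+1) \times n$ matrices which form extended representations of some $k$-dimensional subspaces. Now, we can use Cover's method to encode the Grassmannian. In this setting note that $N\binom{v_j \cdots v_1}{X_j \cdots X_1}$ is equivalent to $n_S(x_1, x_2, \ldots, x_j)$, where $x_i = \binom{v_i}{X_i}$.

Lemma 3:

$$N\binom{v_j \cdots v_1}{X_j \cdots X_1} = \begin{bmatrix} n-j \\ k - \sum_{i=1}^{j} v_i \end{bmatrix}_q.$$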



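Theorem 5: Let $X \in \mathcal{G}_q(n,k)$ be represented by

$$EXT(X) = \begin{pmatrix} v_n & \cdots & v_2 & v_1 \\ X_n & \cdots & X_2 & X_1 \end{pmatrix}.$$

Then the lexicographic index of $X$ is given by

$$\mathrm{Ind}_2(X) = \sum_{j=1}^{n} \left( v_j\, q^{k-w_{j-1}} + (1-v_j) \frac{\{X_j\}}{q^{w_{j-1}}} \right) \begin{bmatrix} n-j \\ k-w_{j-1} \end{bmatrix}_q,$$

where $w_{j-1}$ denotes the weight of the first (rightmost) $j-1$ entries of $v(X)$, i.e., $w_{j-1} = \sum_{\ell=1}^{j-1} v_\ell$.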

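Example 11: Let $X \in \mathcal{G}_2(6,3)$ be given by

$$EXT(X) = \begin{pmatrix} 0&1&0&1&1&0 \\ 0&1&1&0&0&1 \\ 0&0&0&1&0&0 \\ 0&0&0&0&1&1 \end{pmatrix}.$$

By Theorem 5 we have

$$\mathrm{Ind}_2(X) = 5 \cdot \begin{bmatrix} 5 \\ 3 \end{bmatrix}_2 + 2^3 \cdot \begin{bmatrix} 4 \\ 3 \end{bmatrix}_2 + 2^2 \cdot \begin{bmatrix} 3 \\ 2 \end{bmatrix}_2 + 1 \cdot \begin{bmatrix} 2 \\ 1 \end{bmatrix}_2 + 2 \cdot \begin{bmatrix} 1 \\ 1 \end{bmatrix}_2 = 928.$$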



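Now suppose that an index $i$ is given. The following algorithm finds a subspace $X \in \mathcal{G}_q(n,k)$ such that $\mathrm{Ind}_2(X) = i$.

Inverse algorithm: Set $i_0 = i$.

For $j = 1, 2, \ldots, n$ do:

• if $w_{j-1} \ge k$ then set $v(X)_j = 0$, $X_j = \{0\}_q$, and $i_j = i_{j-1}$;

• otherwise

- if $i_{j-1} \ge q^{k-w_{j-1}} \begin{bmatrix} n-j \\ k-w_{j-1} \end{bmatrix}_q$ then set $v(X)_j = 1$, $X_j = \{q^{w_{j-1}}\}_q$, and $i_j = i_{j-1} - q^{k-w_{j-1}} \begin{bmatrix} n-j \\ k-w_{j-1} \end{bmatrix}_q$;

- otherwise let $val = \left\lfloor i_{j-1} \Big/ \begin{bmatrix} n-j \\ k-w_{j-1} \end{bmatrix}_q \right\rfloor$ and set $v(X)_j = 0$, $X_j = \{val \cdot q^{w_{j-1}}\}_q$, and $i_j = i_{j-1} - val \cdot \begin{bmatrix} n-j \\ k-w_{j-1} \end{bmatrix}_q$.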
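The encoding of Theorem 5 and the inverse algorithm above can be sketched in Python as follows. This is a minimal illustration, not an optimized implementation: the function names are hypothetical, a subspace is passed as the list of rows of its RREF (left to right) over a prime field with entries $0, \ldots, q-1$, and the Gaussian coefficients are recomputed at every step rather than updated incrementally.

```python
def gaussian(n, k, q):
    """q-ary Gaussian coefficient; 0 when k is outside 0..n."""
    if k < 0 or k > n:
        return 0
    num = den = 1
    for i in range(k):
        num *= q ** (n - i) - 1
        den *= q ** (i + 1) - 1
    return num // den

def encode(rref, q):
    """Ind_2 of the subspace whose RREF rows are given left to right."""
    k, n = len(rref), len(rref[0])
    pivots = {next(c for c, e in enumerate(row) if e) for row in rref}
    index, w = 0, 0                      # w holds w_{j-1}
    for j in range(1, n + 1):
        c = n - j                        # column j, counted from the right
        g = gaussian(n - j, k - w, q)
        if c in pivots:                  # v_j = 1
            index += q ** (k - w) * g
            w += 1
        else:                            # v_j = 0
            value = 0                    # {X_j}, top row most significant
            for row in rref:
                value = value * q + row[c]
            index += (value // q ** w) * g
    return index

def decode(i, n, k, q):
    """Inverse algorithm: RREF rows of the subspace with Ind_2 equal to i."""
    columns, w = [], 0                   # columns are produced right to left
    for j in range(1, n + 1):
        g = gaussian(n - j, k - w, q)
        threshold = q ** (k - w) * g
        if w < k and i >= threshold:     # pivot column
            i -= threshold
            value = q ** w               # X_j = {q^{w_{j-1}}}_q
            w += 1
        else:                            # non-pivot column
            val = i // g
            i -= val * g
            value = val * q ** w         # X_j = {val * q^{w_{j-1}}}_q
        digits = []
        for _ in range(k):               # base-q digits, least significant first
            digits.append(value % q)
            value //= q
        columns.append(digits[::-1])     # store with the top row first
    return [[columns[n - 1 - c][r] for c in range(n)] for r in range(k)]
```

As a usage example, `decode(928, 6, 3, 2)` returns the rows `[0,1,1,0,0,1]`, `[0,0,0,1,0,0]`, `[0,0,0,0,1,1]`, and `encode` maps that matrix back to 928.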



Theorem 6: The complexity of the encoding/decoding based on the extended representation is $O(nk(n-k) \log n \log\log n)$.

V. Combination of the Encoding Methods 

The only disadvantage of the Ferrers tableaux form encoding is the computation of the $\alpha_l$'s and $\mathrm{ind}_{|\mathcal{F}_X|}(\mathcal{F}_X)$ in Theorem 3. This is the reason for its relatively higher complexity. The advantage of this encoding is that once these values are known, the algorithm becomes trivial. Our solutions for the computation of the $\alpha_l$'s and $\mathrm{ind}_{|\mathcal{F}_X|}(\mathcal{F}_X)$ are relatively inefficient, and this is the main reason why we turned to enumerative encoding based on the RREF and the identifying vector of a subspace. The only disadvantage of this enumerative encoding is the computation of the Gaussian coefficients in Theorem 5. It appears that a combination of the two methods is more efficient than either method separately. The complexity will remain $O(nk(n-k) \log n \log\log n)$, but the constant will be considerably reduced on average. This can be done if there is no need to compute the $\alpha_l$'s and the computation of $\mathrm{ind}_{|\mathcal{F}_X|}(\mathcal{F}_X)$ is simple.

We note that most of the $k$-dimensional subspaces have a Ferrers diagram with a large number of dots. We will encode these subspaces by the Ferrers tableaux form encoding and the other subspaces by the extended representation encoding. We will decide on a set of Ferrers diagrams which will be used for the Ferrers tableaux form encoding. They are taken in order of decreasing number of dots among all the Ferrers diagrams which can be embedded in a $k \times (n-k)$ box.

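We define a new function $\mathrm{Ind}$ in the following way:

$$\mathrm{Ind}(X) = \begin{cases} \mathrm{Ind}_1(X) & \mathcal{F}_X \in S_{\mathcal{F}} \\ \mathrm{Ind}_2(X) + \Delta_X & \text{otherwise} \end{cases}$$

where $S_{\mathcal{F}}$ is the set of Ferrers diagrams chosen for the Ferrers tableaux form encoding and $\Delta_X$ is the number of subspaces formed from $S_{\mathcal{F}}$ which lexicographically succeed $X$ in the extended representation ordering. The inverse algorithm is defined similarly.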



VI. Conclusion and Future Research 

Three methods for enumerative encoding of the Grassmannian are presented. The first is based on the Ferrers tableaux form of subspaces. The second is based on the representation of subspaces by their identifying vector and reduced row echelon form. The complexity of the second method is lower than that of the first. The third method, which combines the first two, reduces on average the constant in the leading term of the complexity of the second method. Improving on these methods is a problem for future research.

Enumerative encoding of the Grassmannian is based on a representation and an order of subspaces. Each such order defines a lexicographic code [17] with prescribed minimum distance (for two subspaces $X, Y \in \mathcal{G}_q(n,k)$ the distance between $X$ and $Y$ is defined by $d(X,Y) = \dim X + \dim Y - 2 \dim(X \cap Y)$ [1]). It appears that some of these lexicodes are the best known. For example, based on the Ferrers tableaux form ordering we found a code with minimum distance 4 and size 4605 in $\mathcal{G}_2(8,4)$, which is the largest known. Considering lexicographic codes in the Grassmannian is a topic for future research. There are some computational aspects involved in this computation, and this is currently under consideration.